Section One: Understanding Price per Square Foot relationships in Philadelphia (The Kitchen Sink)

Scatter Plot Relationships

Plot 1

Note on Figure One

  • The loess smoother suggests a nonlinear relationship between these two variables.

Plot 2

Note on Figure Two

  • There seems to be a steep positive relationship between price per square foot and distance to crime. Most points fall close to a crime location, which speaks to the density of Philadelphia.

Plot 3

Note on Figure Three

  • There does not seem to be any linear relationship between distance to parks and price per square foot.

Plot 4

Note on Figure Four

  • It looks like most landlords live very close to their properties, but beyond that there is no single clear relationship between the two variables.

Test for Significance, Normality, and Skedasticity

Significance

## 
## Call:
## lm(formula = log(hed$inf_prc_ft) ~ hed$d_septa + hed$d_cbd + 
##     hed$d_business + hed$d_h_ramps + hed$d_walk + hed$d_abate + 
##     hed$d_off_site + hed$pct_non_wh + hed$P_ADV_MA_1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.6977 -0.3089  0.1420  0.5542  2.8742 
## 
## Coefficients:
##                     Estimate    Std. Error t value             Pr(>|t|)
## (Intercept)     4.4379823096  0.0323411927 137.224 < 0.0000000000000002
## hed$d_septa     0.0000167068  0.0000017512   9.540 < 0.0000000000000002
## hed$d_cbd      -0.0000117381  0.0000009488 -12.372 < 0.0000000000000002
## hed$d_business -0.0000663426  0.0000108620  -6.108        0.00000000103
## hed$d_h_ramps   0.0000279306  0.0000018793  14.862 < 0.0000000000000002
## hed$d_walk      0.0005887736  0.0001020989   5.767        0.00000000819
## hed$d_abate    -0.0000138751  0.0000024342  -5.700        0.00000001212
## hed$d_off_site  0.0008007638  0.0000434711  18.421 < 0.0000000000000002
## hed$pct_non_wh -1.2722839257  0.0223005683 -57.052 < 0.0000000000000002
## hed$P_ADV_MA_1  0.0061787121  0.0004757662  12.987 < 0.0000000000000002
##                   
## (Intercept)    ***
## hed$d_septa    ***
## hed$d_cbd      ***
## hed$d_business ***
## hed$d_h_ramps  ***
## hed$d_walk     ***
## hed$d_abate    ***
## hed$d_off_site ***
## hed$pct_non_wh ***
## hed$P_ADV_MA_1 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9148 on 23248 degrees of freedom
## Multiple R-squared:  0.3109, Adjusted R-squared:  0.3106 
## F-statistic:  1165 on 9 and 23248 DF,  p-value: < 0.00000000000000022

Note on the SEPTA Effect

  • A one-foot increase in the distance to a SEPTA stop is associated with a 0.0016707% increase in price per square foot in Philadelphia, all else equal.
  • All of the predictors included in the regression equation are significant at the 0.001 level.
  • I tested for collinearity by creating a correlation matrix and did not find that any pair of predictors was collinear (|correlation| > 0.8).
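The correlation-matrix screen described in the note can be sketched in base R as follows. This is a minimal illustration on simulated data, not the original analysis: the data frame `demo` and its columns `x1`, `x2`, and `x3` are hypothetical stand-ins for the model's predictors, and 0.8 is the threshold stated in the note.

```r
# Simulated stand-in predictors (hypothetical; not the hed data).
set.seed(1)
n <- 200
x1 <- rnorm(n)
demo <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(n, sd = 0.1),  # built to be nearly collinear with x1
  x3 = rnorm(n)                  # generated independently of the others
)

# Pairwise Pearson correlations between the predictors.
cor_mat <- cor(demo)

# Keep each pair once (upper triangle) and flag |r| > 0.8.
flagged <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
flagged_pairs <- data.frame(
  var1 = rownames(cor_mat)[flagged[, "row"]],
  var2 = colnames(cor_mat)[flagged[, "col"]],
  r    = cor_mat[flagged]
)
print(flagged_pairs)
```

Pairwise correlations can miss collinearity that involves three or more predictors at once, so variance inflation factors are a common complement to this kind of screen.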

Normality

Note on Normality

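  • A histogram of the residuals is roughly bell-shaped, which suggests approximate normality. However, the residual quantiles reported above (minimum -6.70, median 0.14, maximum 2.87) point to a longer left tail, so the distribution is likely left-skewed.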
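A histogram can look bell-shaped while still hiding skew, so a numeric check is a useful complement. The sketch below computes sample skewness in base R on simulated values. `skewness()` is a helper defined here (it is not a base R function), and `res` is a hypothetical stand-in for the model residuals.

```r
# Sample skewness: third central moment divided by the cubed
# (population-form) standard deviation.
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

set.seed(2)
# Hypothetical residuals with a long left tail (not the report's residuals).
res <- 1 - rexp(1000)

skew_res <- skewness(res)
print(skew_res)  # negative values indicate a longer left tail
```

For a graphical version of the same check, `qqnorm(res)` followed by `qqline(res)` draws a normal Q-Q plot; points bending away from the line in the lower tail indicate a departure from normality there.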

Skedasticity

Note on Skedasticity

  • Figure Six does not look like “random noise”, which suggests that the residuals are heteroskedastic rather than homoskedastic.
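The visual impression of heteroskedasticity can be backed up with a formal test. The sketch below implements Koenker's studentized version of the Breusch-Pagan test by hand in base R. It is an assumption that this test suits the report's model, and the data here are simulated: `x`, `y`, and `fit` are hypothetical and unrelated to the hed data.

```r
set.seed(3)
n <- 500
x <- runif(n, min = 1, max = 10)
# The error spread grows with x, so these data are heteroskedastic by construction.
y <- 2 + 0.5 * x + rnorm(n, sd = x)

fit <- lm(y ~ x)

# Auxiliary regression: squared residuals on the same predictor.
e2 <- resid(fit)^2
aux <- lm(e2 ~ x)

# Koenker's studentized Breusch-Pagan statistic is n * R^2 of the auxiliary
# regression, compared with a chi-squared distribution whose degrees of
# freedom equal the number of predictors (here 1).
bp_stat <- n * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

print(c(statistic = bp_stat, p_value = bp_p))
```

A small p-value rejects the null hypothesis of constant error variance. The `bptest()` function in the lmtest package offers a packaged version of this test.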

Spatial Autocorrelation

Map of Residuals

Moran’s I

## 
##  Moran I test under randomisation
## 
## data:  regA$residuals  
## weights: nb2listw(spatialWeights, style = "W")  
## 
## Moran I statistic standard deviate = 75.899, p-value <
## 0.00000000000000022
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##     0.32925793172    -0.00004299781     0.00001882400

Note on Moran’s I and Spatial Autocorrelation

  • Moran’s I of 0.329, with a p-value well below 0.05, shows significant positive spatial autocorrelation in the residuals: similar residuals cluster together in space (a negative I would instead indicate dispersion).
  • This model explains 31.06% of the variance in the logged price per square foot (adjusted R-squared of 0.3106), leaving most of the variation in price per square foot unexplained.
  • The SEPTA coefficient is very small and positive (prices rise slightly with distance from a stop), so this model does not demonstrate a willingness to pay for transit.
  • This model could be improved with other predictors, like distance to Regional Rail lines, bus stops, or major employers like Penn and Jefferson.
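The standard deviate in the Moran's I output can be checked by hand from the three sample estimates it reports. The sketch below uses only numbers copied from that output. The sample size `n` is an inference, taken as the 23248 residual degrees of freedom plus the 10 estimated coefficients.

```r
# Values copied from the moran.test output above.
moran_i     <- 0.32925793172
expectation <- -0.00004299781
variance    <- 0.00001882400

# Standard deviate: distance of the observed I from its expectation,
# measured in standard deviations.
z <- (moran_i - expectation) / sqrt(variance)
print(round(z, 3))

# One-sided p-value for the "greater" alternative (normal approximation).
p_value <- pnorm(z, lower.tail = FALSE)
print(p_value)

# Under randomisation the expectation of I is -1 / (n - 1).
n <- 23258  # inferred: 23248 residual df + 10 coefficients
print(-1 / (n - 1))
```

A deviate this far above zero is why the test reports a p-value below the smallest value R prints, and the match between `-1 / (n - 1)` and the reported expectation is consistent with the inferred sample size.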

Section Two: The Transit Discontinuity

Discontinuity Plots

Discontinuity Base Map

Discontinuity Plot

Note on Discontinuity Plot

  • This plot suggests less willingness to pay for access to transit between 1/4 and 1/2 mile from a station (the control group), with a price jump once houses are just inside the 1/4 mile boundary.
  • The regression line trends downward the closer houses are to subway lines, in both the control and treatment groups.
  • Overall, this suggests that buyers are not willing to pay much for transit access.
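The control and treatment groups described above can be set up with a simple distance indicator. The sketch below shows one way to do this in base R on simulated data. Only the name `lt_qrtMi` comes from the report's regression table. The data frame `sales`, the distance column `d_station` (assumed to be in feet), and the built-in 0.10 premium are all hypothetical.

```r
set.seed(4)
n <- 2000
# Hypothetical distance (feet) from each sale to its nearest subway station.
d_station <- runif(n, min = 0, max = 5280)

# Simulated log price with a built-in 0.10 premium inside a quarter mile (1,320 ft).
log_price <- 4 + 0.10 * (d_station < 1320) + rnorm(n, sd = 0.2)
sales <- data.frame(d_station, log_price)

# Keep only sales within half a mile (2,640 ft) of a station.
band <- subset(sales, d_station <= 2640)

# Treatment indicator: 1 inside the quarter-mile boundary,
# 0 in the quarter-to-half-mile ring (the control group).
band$lt_qrtMi <- as.integer(band$d_station < 1320)

# The coefficient on lt_qrtMi estimates the log-price gap between the groups.
rd_fit <- lm(log_price ~ lt_qrtMi, data = band)
print(coef(rd_fit))
```

Restricting the sample to the half-mile band is what makes the two groups comparable: houses on either side of the boundary share a neighborhood, so the indicator is meant to isolate the effect of being slightly closer to the station.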

Graduated Symbol Map

Note on Graduated Symbol Map

  • The map shows that some transit stops are associated with a price per square foot premium inside the quarter-mile boundary compared with the quarter-to-half-mile ring, while others are associated with a discount.
  • It seems that stations in Center City and South Philly have the largest premium, while proximity to Northeast Philly stations hurts price per square foot.

Regression Results of Discontinuity

Stargazer Regression

Dependent variable: log(inf_prc_ft)

                                 Just the fixed effect     w/ Station fixed effects    w/ other variables
                                 (1)                       (2)                         (3)

15TH STREET                                                0.2983
2ND STREET                                                 0.1147
30TH STREET                                                0.1679
34TH STREET                                                -0.5911
40TH STREET                                                -1.5877***
46TH STREET                                                -1.3493***
52ND STREET                                                -2.2064***
56TH STREET                                                -2.2627***
5TH STREET                                                 0.2402
60TH STREET                                                -2.3030***
63RD STREET                                                -1.9060***
ALLEGHENY                                                  -2.1575***
BERKS                                                      -1.3154***
CECIL B MOORE                                              -2.0553***
CHURCH                                                     -1.8937***
ELLSWORTH-FEDERAL                                          -1.0866***
ERIE                                                       -2.4807***
ERIE-TORRESDALE                                            -1.4154***
FAIRMOUNT                                                  -1.1919***
FERN ROCK TRANSPORTATION CENTER                            -1.4779***
FRANKFROD TRANSPORTATION CENTER                            -1.3789***
HUNTING PARK                                               -2.5144***
HUNTINGDON                                                 -2.5218***
LOGAN                                                      -1.9561***
LOMBARD-SOUTH                                              -0.0812
MARGARET-ORTHODOX                                          -1.8744***
NORTH PHILADELPHIA                                         -3.0459***
OLNEY                                                      -1.5325***
OREGON                                                     -0.6467*
PATTISON                                                   -0.1022
RACE-VINE                                                  0.0988
SNYDER                                                     -0.9134**
SOMERSET                                                   -2.5338***
SPRING GARDEN                                              -0.2758
SPRING GARDEN (BROAD STREET)                               -0.2805
SUSQUEHANNA-DAUPHIN                                        -2.7773***
TASKER-MORRIS                                              -1.2568***
TIOGA                                                      -1.8578***
WALNUT-LOCUST                                              0.2096
WYOMING                                                    -2.3517***
YORK-DAUPHIN                                               -2.4767***
lt_qrtMi                         -0.0879***                0.0163                      -0.1011***
d_cbd                                                                                  -0.00001***
d_business                                                                             -0.0003***
d_h_ramps                                                                              -0.00005***
d_walk                                                                                 0.0010***
d_abate                                                                                -0.00002***
d_off_site                                                                             0.0021***
pct_non_wh                                                                             -1.0058***
P_ADV_MA_1                                                                             0.0057***
Constant                         3.8867***                 5.5343***                   4.7401***

Observations                     6,612                     6,612                       6,612
R2                               0.0012                    0.3790                      0.2922
Adjusted R2                      0.0011                    0.3750                      0.2913
Residual Std. Error              1.2320 (df = 6610)        0.9745 (df = 6569)          1.0377 (df = 6602)
F Statistic                      8.0760*** (df = 1; 6610)  95.4502*** (df = 42; 6569)  302.8617*** (df = 9; 6602)

Note: *p<0.1; **p<0.05; ***p<0.01

Note on Stargazer Regression

This table shows that each regression model implies a different willingness to pay for living inside versus outside the quarter-mile transit boundary. In Regression One (reg1), there is a -8.42% difference between price per square foot inside and outside the boundary, all else equal. In Regression Three (reg3), there is a -9.62% difference between price per square foot inside and outside the boundary, all else equal. In Regression Two (reg2), the lt_qrtMi coefficient is not significant, so that model shows no statistically detectable difference in price per square foot across the boundary.
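Because the dependent variable is logged, a coefficient b on a 0/1 indicator converts to a percent difference as 100 * (exp(b) - 1). The sketch below applies that conversion to the lt_qrtMi coefficients as printed in the table. `pct_effect()` is a helper name introduced here for illustration.

```r
# Percent change in the outcome implied by a coefficient b
# in a model with a logged dependent variable.
pct_effect <- function(b) 100 * (exp(b) - 1)

# lt_qrtMi coefficients as printed (rounded to four decimals) in the table above.
b_reg1 <- -0.0879
b_reg3 <- -0.1011

print(round(pct_effect(b_reg1), 2))
print(round(pct_effect(b_reg3), 2))

# For very small coefficients, 100 * b is an adequate approximation,
# e.g. the hed$d_septa coefficient from Section One.
print(round(pct_effect(0.0000167068), 7))
```

Since the table rounds coefficients to four decimals, a result computed this way can differ from a figure computed on the unrounded estimate in the last digit, which is the likely source of any small mismatch with the percentages quoted in the text.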

The station effects in Regression Two (reg2) help us understand which stations have a significant effect on home prices, positive or negative, and which do not. Each station coefficient is measured relative to the omitted reference station. Values with two or three asterisks are significant at the 5% level or better and indicate the station's effect on logged price per square foot, all else equal. Interestingly, many of the University City and Center City stops did not have a significant effect on house prices.

I do not believe that this research design shows us the willingness to pay for transit. The best-fitting regression, Regression Two (reg2), explains only 37.5% of the variance in logged price per square foot (adjusted R2 of 0.3750) using the station fixed effects and the quarter-mile indicator as predictors. Adding more transit-related predictors could improve our understanding of the willingness to pay for transit.

Regression One Residual Map

Regression Two Residual Map

Regression Three Residual Map